AdaDelay: Delay Adaptive Distributed Stochastic Optimization
Authors
Abstract
We develop distributed stochastic convex optimization algorithms under a delayed gradient model in which server nodes update parameters and worker nodes compute stochastic (sub)gradients. Our setup is motivated by the behavior of real-world distributed computation systems; in particular, we analyze a setting wherein worker nodes can be differently slow at different times. In contrast to existing approaches, we do not impose a worst-case bound on the delays; rather, we allow the updates to be sensitive to the actual delays experienced. This sensitivity allows the use of larger stepsizes, which can help speed up initial convergence without having to wait too long for slower machines; the global convergence rate is still preserved. We experiment with different delay patterns, and obtain noticeable improvements for large-scale real datasets with billions of examples and features.
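To make the delay-sensitive update concrete, here is a minimal sketch in Python. The step-size form c / sqrt(t + tau_t), the function names, and the toy objective are illustrative assumptions, not the paper's exact scheme; the point is only that the server scales each step by the gradient's actual observed staleness tau_t instead of a worst-case delay bound.

```python
import numpy as np

def delay_adaptive_step(x, grad, t, delay, c=1.0):
    """Server-side update with a delay-sensitive step size (illustrative).

    `delay` is the observed staleness of `grad`: fresher gradients
    (small delay) take larger steps, while stale ones are damped.
    """
    step = c / np.sqrt(t + delay)  # assumed form: eta_t = c / sqrt(t + tau_t)
    return x - step * grad

# Toy usage: workers return noisy gradients of ||x - 1||^2 with random delays.
rng = np.random.default_rng(0)
x = np.zeros(5)
for t in range(1, 101):
    delay = int(rng.integers(0, 10))          # staleness reported by a worker
    grad = 2 * (x - 1) + rng.normal(size=5)   # stochastic (sub)gradient
    x = delay_adaptive_step(x, grad, t, delay)
```

Because the step size depends on the realized delay rather than on a global bound, fast workers are not penalized for the slowest machine, which is what permits the larger early stepsizes mentioned above.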
Similar Resources
AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization
We study distributed stochastic convex optimization under the delayed gradient model where the server nodes perform parameter updates, while the worker nodes compute stochastic gradients. We discuss, analyze, and experiment with a setup motivated by the behavior of real-world distributed computation networks, where the machines are differently slow at different times. Therefore, we ...
Market Adaptive Control Function Optimization in Continuous Cover Forest Management
Economically optimal management of a continuous cover forest is considered here. Initially, there is a large number of trees of different sizes and the forest may contain several species. We want to optimize the harvest decisions over time, using continuous cover forestry, which is denoted by CCF. We maximize our objective function, the expected present value, with consideration of stochastic p...
Considering Stochastic and Combinatorial Optimization
Here, issues connected with typical stochastic settings are considered. In the first part, the feasibility of covering the solutions of an optimization problem on arbitrary subgraphs is studied. The motivation for this approach is a setting where an optimization problem must be solved frequently for arbitrary instances. Then, a preprocessing stage is considered that would q...
The Convergence of Stochastic Gradient Descent in Asynchronous Shared Memory
Stochastic Gradient Descent (SGD) is a fundamental algorithm in machine learning, representing the optimization backbone for training several classic models, from regression to neural networks. Given the recent practical focus on distributed machine learning, significant work has been dedicated to the convergence properties of this algorithm under the inconsistent and noisy updates arising from...
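As a rough illustration of the asynchronous shared-memory setting this work studies, the sketch below runs a Hogwild-style loop in which several threads update one shared parameter vector without locks; the least-squares objective and all names are assumptions made for the example.

```python
import threading
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(1000, 10))  # toy least-squares data
b = A @ rng.normal(size=10)
x = np.zeros(10)                 # shared parameters, updated without locks

def worker(seed, steps=500, lr=0.001):
    local_rng = np.random.default_rng(seed)
    for _ in range(steps):
        i = int(local_rng.integers(len(A)))
        # Stochastic gradient of 0.5 * (a_i . x - b_i)^2 at the current x.
        g = (A[i] @ x - b[i]) * A[i]
        x[:] -= lr * g           # unsynchronized read-modify-write

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for th in threads:
    th.start()
for th in threads:
    th.join()
```

Even though CPython serializes individual operations, a thread's read of x and its later write can interleave with other threads' updates, so each gradient may be computed at a slightly stale iterate, mimicking the inconsistent, noisy updates analyzed in this line of work.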
Adaptive Multi-Agent Systems for Constrained Optimization
Product Distribution (PD) theory is a new framework for analyzing and controlling distributed systems. Here we demonstrate its use for distributed stochastic optimization. First we review one motivation of PD theory, as the information-theoretic extension of conventional full-rationality game theory to the case of bounded rational agents. In this extension the equilibrium of the game is the opt...